Message vs. messenger effects on cross-modal matching for spoken phrases

Authors

  • Catherine T. Best
  • Christian Kroos
  • Karen E. Mulak
  • Shaun Halovic
  • Mathilde Fort
  • Christine Kitamura
Abstract

A core issue in speech perception and word recognition research is the nature of the information perceivers use to identify spoken utterances across indexical variations in their phonetic details, such as talker and accent differences. Separately, a crucial question in audio-visual research is the nature of the information perceivers use to recognize phonetic congruency between the audio and visual (talking face) signals that arise from speaking. We combined these issues in a study examining how differences between connected speech utterances (messages) versus differences between talkers and accents (messenger characteristics) contribute to recognition of cross-modal articulatory congruence between audio-only (AO) and video-only (VO) components of spoken utterances. Participants heard AO phrases in their native regional English accent or another English accent, and then saw two synchronous VO displays of point-light talking faces from which they had to select the one that corresponded to the audio target. The incorrect video in each pair showed either the same phrase as the audio target or a different one, produced by the same or a different talker, who spoke in either the same or a different English accent. Results indicate that cross-modal articulatory correspondence is more accurately and quickly detected for message content than for messenger details, suggesting that recognising the linguistic message is more fundamental than recognising messenger features to cross-modal detection of audio-visual articulatory congruency. Nonetheless, messenger characteristics, especially accent, affected performance to some degree, analogous to recent findings in AO speech research.

Similar resources

Grounding Natural Spoken Language Semantics in Visual Perception and Motor Control

A characteristic shared by most approaches to natural language understanding and generation is the use of symbolic representations of word and sentence meanings. Frames and semantic nets are two popular current approaches. Symbolic methods alone are inadequate for applications such as conversational robotics that require natural language semantics to be linked to perception and motor control. T...

Cross-modality matching of linguistic and emotional prosody

Talkers can express different meanings or emotions without changing what is said by changing how it is said (by using both auditory and/or visual speech cues). Typically, cue strength differs between the auditory and visual channels: linguistic prosody (expression) is clearest in audition; emotional prosody is clearest visually. We investigated how well perceivers can match auditory and visual ...

The Use of Domain-initial Strengthening in Segmentation of Continuous English Speech

Prosodic structure in English speech is signalled, in part, by stronger articulation of consonants at the onset of intonational phrases (IPs) than of consonants that are IP-medial. In two cross-modal priming experiments, American English listeners heard sentences and decided whether visual letter strings, presented during the sentences, were real words. We manipulated sentence type (either no I...

Age differences in electrophysiological correlates of cross-modal phrasal interpretation

Research shows that older adults may be more sensitive than young adults to prosody, although performance varies depending on task requirements. Here we used electroencephalography to examine responses to simple phrases produced with an Early or Late boundary, presented with matching or mismatching visual displays. While some older adults successfully detected prosodic mismatches, many failed t...

Publication date: 2015